Smart Paper Notes

Semantics-Consistent Feature Search for Self-Supervised Visual Representation Learning

Kaiyou Song, Shan Zhang, Zihao An, Zimeng Luo, Tong Wang, Jin Xie

Category: Computer Vision

2022-12-13

In contrastive self-supervised learning, the common way to learn discriminative representation is to pull different augmented "views" of the same image closer while pushing all other images further apart, which has been proven to be effective. However, it is unavoidable to construct undesirable views containing different semantic concepts during the augmentation procedure. It would damage the semantic consistency of representation to pull these augmentations closer in the feature space indiscriminately. In this study, we introduce feature-level augmentation and propose a novel semantics-consistent feature search (SCFS) method to mitigate this negative effect. The main idea of SCFS is to adaptively search semantics-consistent features to enhance the contrast between semantics-consistent regions in different augmentations. Thus, the trained model can learn to focus on meaningful object regions, improving the semantic representation ability. Extensive experiments conducted on different datasets and tasks demonstrate that SCFS effectively improves the performance of self-supervised learning and achieves state-of-the-art performance on different downstream tasks.
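
The "pull positives together, push other images apart" objective mentioned above is commonly written as an InfoNCE loss. The following is a minimal sketch of that generic loss only, not of SCFS itself; the function names `cosine` and `info_nce` and the temperature value are assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: -log softmax score of the positive view.

    `positive` is another augmented view of the same image; `negatives`
    are features of other images. The temperature is an assumed default.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum_exp - logits[0]

# A well-aligned positive yields a lower loss than a misaligned one.
good = info_nce([1.0, 0.0], [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
bad = info_nce([1.0, 0.0], [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
```

If the positive view happens to show a different semantic concept than the anchor, this loss still pulls the two together, which is the failure mode the abstract describes.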

Interdisciplinary Discovery of Nanomaterials Based on Convolutional Neural Networks

Tong Xie, Yuwei Wan, Weijian Li, Qingyuan Linghu, Shaozhou Wang, Yalun Cai, Han Liu, Chunyu Kit, Clara Grazian, Bram Hoex

Category: Machine Learning

2022-12-06

The material science literature contains up-to-date and comprehensive scientific knowledge of materials. However, its content is unstructured and diverse, resulting in a significant gap in providing sufficient information for material design and synthesis. To this end, we used natural language processing (NLP) and computer vision (CV) techniques based on convolutional neural networks (CNN) to discover valuable experiment-based information about nanomaterials and synthesis methods in energy-material-related publications. Our first system, TextMaster, extracts opinions from texts and classifies them into challenges and opportunities, achieving 94% and 92% accuracy, respectively. Our second system, GraphMaster, realizes data extraction of tables and figures from publications with 98.3% classification accuracy and 4.3% data extraction mean square error. Our results show that these systems could assess the suitability of materials for a certain application by evaluation of synthesis insights and case analysis with detailed references. This work offers a fresh perspective on mining knowledge from scientific literature, providing a wide swatch to accelerate nanomaterial research through CNN.

Easy Begun is Half Done: Spatial-Temporal Graph Modeling with ST-Curriculum Dropout

Hongjun Wang, Jiyuan Chen, Tong Pan, Zipei Fan, Boyuan Zhang, Renhe Jiang, Lingyu Zhang, Yi Xie, Zhongyi Wang, Xuan Song

Category: Machine Learning | Artificial Intelligence

2022-11-28

Spatial-temporal (ST) graph modeling, such as traffic speed forecasting and taxi demand prediction, is an important task in the deep learning area. However, the ST patterns of the nodes in a graph can vary greatly in modeling difficulty, owing to the heterogeneous nature of ST data. We argue that unveiling the nodes to the model in a meaningful order, from easy to complex, can provide performance improvements over the traditional training procedure. The idea has its roots in curriculum learning, which suggests that models can be sensitive to noise and difficult samples in the early stage of training. In this paper, we propose ST-Curriculum Dropout, a novel and easy-to-implement strategy for spatial-temporal graph modeling. Specifically, we evaluate the learning difficulty of each node in a high-level feature space and drop the difficult ones out, so that the model only needs to handle fundamental ST relations at the beginning before gradually moving to hard ones. Our strategy can be applied to any canonical deep learning architecture without extra trainable parameters. Extensive experiments on a wide range of datasets illustrate that, by controlling the difficulty level of ST relations as training progresses, the model is able to capture a better representation of the data and thus generalizes better.
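
The easy-to-hard schedule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the per-node `difficulty` scores are taken as given (the abstract says they are computed in a high-level feature space, but not how), and the linear schedule, the `start` fraction, and the names `keep_ratio` and `select_nodes` are hypothetical.

```python
def keep_ratio(step, total_steps, start=0.5):
    """Fraction of nodes kept at `step`, growing linearly from `start` to 1.0.

    The linear schedule and the default `start` are assumptions.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return start + (1.0 - start) * progress

def select_nodes(difficulty, step, total_steps, start=0.5):
    """Return the sorted indices of the easiest nodes to train on at `step`.

    `difficulty[i]` is a precomputed score for node i; higher means harder.
    """
    n_keep = max(1, round(keep_ratio(step, total_steps, start) * len(difficulty)))
    ranked = sorted(range(len(difficulty)), key=lambda i: difficulty[i])
    return sorted(ranked[:n_keep])

scores = [0.9, 0.1, 0.5, 0.3]
early = select_nodes(scores, step=0, total_steps=10)   # only the easiest half
middle = select_nodes(scores, step=5, total_steps=10)  # three of four nodes
late = select_nodes(scores, step=10, total_steps=10)   # every node
```

Because the selection only masks nodes, a scheme like this adds no trainable parameters, consistent with the claim in the abstract.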

Comparison-based Conversational Recommender System with Relative Bandit Feedback

Zhihui Xie, Tong Yu, Canzhe Zhao, Shuai Li

Category: Machine Learning

2022-08-21

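With recent advances in conversational recommendation, recommender systems can actively and dynamically elicit user preferences through conversational interactions. To this end, the system periodically queries users about their preferences on attributes and collects their feedback. However, most existing conversational recommender systems only allow users to provide absolute feedback on attributes. In practice, absolute feedback is usually limited, because users tend to give biased feedback when expressing preferences. Instead, since user preferences are inherently relative, users are often more inclined to express comparative preferences. To let users provide comparative preferences during conversational interactions, we propose a comparison-based conversational recommender system. Relative feedback, although more practical, is not easy to incorporate, because its feedback scale is always mismatched with users' absolute preferences. By effectively collecting and understanding relative feedback in an interactive manner, we further propose a new bandit algorithm, which we call RelativeConUCB. Experiments on synthetic and real-world datasets validate the advantage of our proposed method over existing bandit algorithms in conversational recommender systems.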

Distributed Learning of Neural Lyapunov Functions for Large-Scale Networked Dissipative Systems

Amit Jena, Tong Huang, S. Sivaranjani, Dileep Kalathil, Le Xie

Category: Machine Learning

2022-07-15

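This paper considers the problem of characterizing the stability region of a large-scale networked system composed of subsystems, in a distributed and computationally tractable way. One standard approach to estimating the stability region of a general nonlinear system is to first find a Lyapunov function for the system and characterize its region of attraction as the stability region. However, classical approaches for finding a Lyapunov function, such as sum-of-squares methods and quadratic approximation, either do not scale to large systems or give very conservative estimates of the stability region. In this context, we propose a new distributed-learning-based approach that exploits the dissipativity structure of the subsystems. Our approach has two parts: the first is a distributed method for learning the storage functions (similar to Lyapunov functions) of all the subsystems, and the second is a distributed optimization method that finds a Lyapunov function for the networked system using the learned storage functions of the subsystems. We demonstrate the superior performance of our proposed approach through extensive case studies in microgrid networks.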

FairVFL: A Fair Vertical Federated Learning Framework with Contrastive Adversarial Learning

Tao Qi, Fangzhao Wu, Chuhan Wu, Lingjuan Lyu, Tong Xu, Zhongliang Yang, Yongfeng Huang, Xing Xie

Category: Machine Learning

2022-06-07

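Vertical federated learning (VFL) is a privacy-preserving machine learning paradigm that can learn models from features distributed across different platforms in a privacy-preserving way. Since data in real-world applications may contain bias on fairness-sensitive features (e.g., gender), VFL models may inherit bias from the training data and become unfair to some user groups. However, existing fair ML methods usually rely on centralized storage of fairness-sensitive features to achieve model fairness, which is usually inapplicable in federated scenarios. In this paper, we propose a fair vertical federated learning framework (FairVFL) that can improve the fairness of VFL models. The core idea of FairVFL is to learn unified and fair representations of samples from decentralized feature fields in a privacy-preserving way. Specifically, each platform with fairness-insensitive features first learns local data representations from its local features. These local representations are then uploaded to a server and aggregated into a unified representation for the target task. To learn fair unified representations, we send them to each platform that stores fairness-sensitive features and apply adversarial learning to remove the bias that the unified representations inherit from the biased data. Moreover, to protect user privacy, we further propose a contrastive adversarial learning method that removes private information from the unified representations on the server before they are sent to the platforms holding fairness-sensitive features. Experiments on two real-world datasets validate that our method can effectively improve model fairness while keeping user privacy well protected.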

Non-Parametric Domain Adaptation for End-to-End Speech Translation

Yichao Du, Weizhi Wang, Zhirui Zhang, Boxing Chen, Tong Xu, Jun Xie, Enhong Chen

Category: Natural Language Processing | Artificial Intelligence

2022-05-23

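End-to-end speech translation (E2E-ST) has received increasing attention due to its potential for less error propagation, lower latency, and fewer parameters. However, the effectiveness of neural approaches to this task is severely limited by the available training corpora, especially for domain adaptation, where in-domain triplet training data is scarce or nonexistent. In this paper, we propose a novel non-parametric method that leverages a domain-specific text translation corpus to achieve domain adaptation for E2E-ST systems. To this end, we first incorporate an additional encoder into the pre-trained E2E-ST model to enable text translation modeling, and then unify the decoder's output representation for the text and speech translation tasks by reducing the corresponding representation mismatch in the available triplet training data. During domain adaptation, a k-nearest-neighbor (kNN) classifier is introduced to produce the final translation distribution using an external datastore built from the domain-specific text translation corpus, while the universal output representation is used to perform the similarity search. Experiments on the Europarl-ST benchmark show that, when only in-domain text translation data is involved, our proposed approach significantly improves over the baseline on average in all translation directions, even outperforming a strong in-domain fine-tuning method.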
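
The kNN step described above can be illustrated with a small retrieval sketch. It assumes a datastore of (decoder representation, target token) pairs and follows the common kNN-MT recipe of turning neighbour distances into a distribution and interpolating it with the model's own prediction. The interpolation weight `lam`, the `temperature`, the function names, and the toy datastore are all assumptions, not details taken from the paper.

```python
import math
from collections import defaultdict

def knn_distribution(query, datastore, k=2, temperature=1.0):
    """Token distribution from the k nearest datastore entries.

    `datastore` is a list of (key_vector, target_token) pairs; weights are
    a softmax over negative squared L2 distances (an assumed choice).
    """
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    scored = [(sq_dist(query, key), tok) for key, tok in datastore]
    neighbours = sorted(scored, key=lambda pair: pair[0])[:k]
    weights = [math.exp(-d / temperature) for d, _ in neighbours]
    total = sum(weights)
    probs = defaultdict(float)
    for w, (_, tok) in zip(weights, neighbours):
        probs[tok] += w / total
    return dict(probs)

def interpolate(model_probs, knn_probs, lam=0.5):
    """Mix the kNN and model distributions with assumed weight `lam`."""
    vocab = set(model_probs) | set(knn_probs)
    return {t: lam * knn_probs.get(t, 0.0) + (1 - lam) * model_probs.get(t, 0.0)
            for t in vocab}

# Toy in-domain datastore; the vectors and tokens are made up.
store = [([0.0, 0.0], "Haus"), ([0.1, 0.0], "Haus"), ([5.0, 5.0], "Baum")]
knn = knn_distribution([0.0, 0.0], store, k=2)
final = interpolate({"Haus": 0.4, "Baum": 0.6}, knn, lam=0.5)
```

In this toy case both retrieved neighbours vote for the same token, so the mixed distribution shifts toward it even though the base model preferred the other token.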

BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers

Zhiqi Li, Wenhai Wang, Hongyang Li, Enze Xie, Chonghao Sima, Tong Lu, Qiao Yu, Jifeng Dai

Category: Computer Vision

2022-03-31

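3D visual perception tasks, including 3D detection and map segmentation based on multi-camera images, are essential for autonomous driving systems. In this work, we present a new framework termed BEVFormer, which learns unified BEV representations with spatiotemporal transformers to support multiple autonomous driving perception tasks. In a nutshell, BEVFormer exploits both spatial and temporal information by interacting with the spatial and temporal spaces through predefined grid-shaped BEV queries. To aggregate spatial information, we design spatial cross-attention, in which each BEV query extracts spatial features from regions of interest across camera views. For temporal information, we propose temporal self-attention to recurrently fuse historical BEV information. Our approach achieves a new state-of-the-art 56.9% in terms of the NDS metric on the nuScenes test set, which is 9.0 points higher than the previous best results and on par with the performance of LiDAR-based baselines. We further show that BEVFormer markedly improves the accuracy of velocity estimation and the recall of objects under low-visibility conditions. The code is available at https://github.com/zhiqi-li/bevformer.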

Direct Molecular Conformation Generation

Jinhua Zhu, Yingce Xia, Chang Liu, Lijun Wu, Shufang Xie, Yusong Wang, Tong Wang, Tao Qin, Wengang Zhou, Houqiang Li

Category: Artificial Intelligence

2022-02-03

Molecular conformation generation aims to generate three-dimensional coordinates of all the atoms in a molecule and is an important task in bioinformatics and pharmacology. Previous methods usually first predict the interatomic distances, the gradients of interatomic distances or the local structures (e.g., torsion angles) of a molecule, and then reconstruct its 3D conformation. How to directly generate the conformation without the above intermediate values is not fully explored. In this work, we propose a method that directly predicts the coordinates of atoms: (1) the loss function is invariant to roto-translation of coordinates and permutation of symmetric atoms; (2) the newly proposed model adaptively aggregates the bond and atom information and iteratively refines the coordinates of the generated conformation. Our method achieves the best results on the GEOM-QM9 and GEOM-Drugs datasets. Further analysis shows that our generated conformations have properties (e.g., HOMO-LUMO gap) closer to those of the ground-truth conformations. In addition, our method improves molecular docking by providing better initial conformations. All the results demonstrate the effectiveness of our method and the great potential of the direct approach. The code is released at https://github.com/DirectMolecularConfGen/DMCG

Regularizing End-to-End Speech Translation with Triangular Decomposition Agreement

Yichao Du, Zhirui Zhang, Weizhi Wang, Boxing Chen, Jun Xie, Tong Xu

Category: Natural Language Processing

2021-12-21

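End-to-end speech-to-text translation (E2E-ST) is becoming increasingly popular due to its potential for less error propagation, lower latency, and fewer parameters. Given the triplet training corpus $\langle \text{speech}, \text{transcription}, \text{translation} \rangle$, a conventional high-quality E2E-ST system uses the $\langle \text{speech}, \text{transcription} \rangle$ pairs to pre-train the model and then uses the $\langle \text{speech}, \text{translation} \rangle$ pairs to optimize it further. However, this process involves only two-tuple data at each stage, and this loose coupling fails to fully exploit the association among the triplet data. In this paper, we attempt to model the joint probability of transcription and translation given the speech input, in order to leverage such triplet data directly. Based on this, we propose a novel regularization method for model training that improves the agreement between the dual-path decompositions of the triplet data, which should be equal in theory. To achieve this goal, we introduce two Kullback-Leibler divergence regularization terms into the model training objective to reduce the mismatch between the output probabilities of the two paths. The well-trained model can then be naturally turned into an E2E-ST model through a pre-defined early-stop tag. Experiments on the MuST-C benchmark show that our proposed approach significantly outperforms state-of-the-art E2E-ST baselines on all 8 language pairs, while also achieving better performance on the automatic speech recognition task. Our code is open-sourced at https://github.com/duyichao/e2e-st-tda.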
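
The dual-path idea can be written out with the chain rule. The notation below is assumed for illustration ($s$ for the speech input, $x$ for the transcription, $y$ for the translation, and $P_{1}$, $P_{2}$ for the model's estimates along the two paths); the exact form of the regularizer in the paper may differ from this symmetric sketch.

```latex
% Two chain-rule decompositions of the same joint probability
% (s: speech, x: transcription, y: translation; notation assumed here).
P(x, y \mid s) = P(x \mid s)\, P(y \mid x, s) = P(y \mid s)\, P(x \mid y, s)

% P_{1} and P_{2} denote the model's estimates along the two paths.
% A two-term KL regularizer penalizes their disagreement in both directions.
\mathcal{L}_{\text{reg}} = \mathrm{KL}\left(P_{1} \,\|\, P_{2}\right) + \mathrm{KL}\left(P_{2} \,\|\, P_{1}\right)
```

Since both decompositions equal the same joint probability, any gap between the two estimated paths reflects model error, which is what the two KL terms are meant to shrink.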
